The gibbonsecr software package uses Spatially Explicit Capture–Recapture (SECR) methods to estimate the density of gibbon populations from acoustic survey data. This manual begins with a brief introduction to the theory behind SECR and then describes the main components of the software.
Over the past decade SECR has become an increasingly popular tool for wildlife population assessment and has been used to analyse survey data for a wide range of animal groups. The main advantage it has over traditional capture–recapture techniques is that it allows direct estimation of population density rather than abundance. Traditional capture–recapture methods can only provide density estimates through the use of separate estimates (or assumptions) about the size of the sampled area. In SECR, however, density is estimated from the survey data by using information contained in the pattern of the recapture data (relative to the locations of the detectors) to make inferences about the spatial location of animals. By extracting spatial information in this way SECR can provide direct estimates of density without requiring the exact locations of the detected animals to be known in advance.
The basic data collection setup for an SECR analysis consists of a spatial array of detectors. Detectors come in a variety of different forms, including traps which physically detain the animals, and proximity detectors which do not. Using proximity detectors it is possible for an animal to be detected at more than one detector (i.e. recaptured) during a single sampling occasion.
The plot below shows a hypothetical array of proximity detectors, with red squares representing detections of the same animal (or the same group in the case of gibbons surveys) and black squares representing no detections.
The pattern of the detections (i.e. the pattern of the recapture data) gives us information about the true location of the animal/group; intuitively we would guess that it is probably near the cluster of red detectors. The plot below shows a set of probability contours for this unknown location, given the recapture data.
In the case of acoustic gibbon surveys the listening posts can be treated as proximity detectors and the same logic can be applied to infer the unknown locations of the detected groups. However, the design shown in the figure above would obviously be impractical for gibbon surveys. The next figure shows probability contours for a more realistic array of listening posts where a group has been detected at two of the posts.
As you probably guessed from the previous section, using fewer detectors results in less information on the unknown locations. Fortunately however, SECR also allows supplementary information on group location to be included in the analysis – for example in the form of estimated bearings to the detected animals/groups. The next figure illustrates how taking account of information contained in the estimated bearings can provide better quality information on animal/group locations.
Using estimated bearings in this way can lead to density estimates that are less biased and more precise than using recapture data alone. Since the precision of bearing estimates is usually unknown, SECR methods estimate it from the data. This requires the choice of a bearing error distribution. The figure below shows two common choices of distribution for modelling bearing errors – the von Mises and the wrapped Cauchy – where the color of the lines indicates the value of the precision parameter (SECR estimates the value of this parameter from the survey data).
The wrapped Cauchy is likely to perform better when most of the estimated bearings are close to the truth but there are occasional large errors. The von Mises is likely to perform better when large errors are rare.
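To make the comparison between the two distributions concrete, the base-R sketch below defines both densities directly from their standard formulas (the manual's figure uses the same two distributions; the parameter values chosen here are purely illustrative). `mu` is the true bearing in radians, and `kappa` (von Mises) and `rho` (wrapped Cauchy) play the role of the precision parameter that gibbonsecr estimates from the survey data.

```r
# Densities of the two bearing-error distributions (base R sketch).
# Higher kappa / rho means errors concentrated more tightly around
# the true bearing mu.

dvonmises <- function(theta, mu = 0, kappa = 1) {
  # von Mises density: exp(kappa * cos(theta - mu)) / (2 * pi * I0(kappa))
  exp(kappa * cos(theta - mu)) / (2 * pi * besselI(kappa, nu = 0))
}

dwrappedcauchy <- function(theta, mu = 0, rho = 0.5) {
  # wrapped Cauchy density on (-pi, pi)
  (1 - rho^2) / (2 * pi * (1 + rho^2 - 2 * rho * cos(theta - mu)))
}

# Note the heavier tails of the wrapped Cauchy relative to the von Mises
theta <- seq(-pi, pi, length.out = 200)
plot(theta, dvonmises(theta, kappa = 5), type = "l", col = "blue",
     xlab = "bearing error (radians)", ylab = "density")
lines(theta, dwrappedcauchy(theta, rho = 0.7), col = "red")
```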
Another key feature of SECR is that the probability of detecting a (calling) gibbon group at a given location is modelled as a function of distance from the detector. This function – referred to as the detection function – is typically assumed to belong to one of two main types: the half normal or the hazard rate. The specific shape of the detection function depends on the value of its parameters, which need to be estimated from the survey data. The half normal has two parameters: g0 and sigma. The g0 parameter gives the probability at zero distance and the sigma parameter controls the width of the function. The hazard rate has three parameters: g0, sigma and z. The z parameter controls the shape of the ‘shoulder’ and adds a greater degree of flexibility. The figure below illustrates the shape of these detection functions for a range of parameter values.
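The two functional forms can be written down in a few lines of R. This is a sketch of the standard half normal and hazard rate forms used in SECR (the default parameter values below are arbitrary, chosen only for illustration):

```r
# Half normal: probability declines smoothly from g0 at zero distance,
# with sigma controlling the width of the function.
halfnormal <- function(d, g0 = 1, sigma = 500) {
  g0 * exp(-d^2 / (2 * sigma^2))
}

# Hazard rate: the extra z parameter controls the shape of the
# 'shoulder' before the probability starts to drop.
hazardrate <- function(d, g0 = 1, sigma = 500, z = 5) {
  g0 * (1 - exp(-(d / sigma)^(-z)))
}

# Compare the two shapes over 0-2000m
d <- seq(0, 2000, length.out = 200)
plot(d, halfnormal(d), type = "l", col = "blue",
     xlab = "distance (m)", ylab = "detection probability")
lines(d, hazardrate(d), col = "red")
```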
The association of a detection function with each detector allows the overall probability of detection by at least one detector during the survey to be calculated for any given animal/group location. The figure below illustrates this idea of overall detection probability using a heatmap of a detection surface.
The region near the centre of the surface is close to the detector array and has the highest detection probability. In the figure above, for example, an animal/group close to the detectors will almost certainly be detected. This probability declines as distance from the detectors increases.
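The overall detection probability at a location is one minus the probability of being missed by every detector. The sketch below illustrates this calculation for a hypothetical array of three posts, using a half normal detection function (repeated here so the example is self-contained; the coordinates and parameter values are illustrative):

```r
halfnormal <- function(d, g0 = 1, sigma = 500) g0 * exp(-d^2 / (2 * sigma^2))

# Probability of detection by at least one post for a group at location x:
# 1 minus the product of the per-post miss probabilities.
p_overall <- function(x, posts, g0 = 1, sigma = 500) {
  d <- sqrt((posts[, 1] - x[1])^2 + (posts[, 2] - x[2])^2)
  1 - prod(1 - halfnormal(d, g0, sigma))
}

# Hypothetical array of three listening posts on a line (metres)
posts <- cbind(x = c(0, 500, 1000), y = c(0, 0, 0))

p_overall(c(500, 0), posts)     # on top of a post: certain detection
p_overall(c(500, 5000), posts)  # far from the array: near zero
```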
The shape of the detection surface is related to the size of the effective sampling area. Since the region close to the detectors has a very high detection probability, most animals/groups within this region will be detected and this region will therefore be almost perfectly sampled. However, regions where the detection probability is less than 1 will not be completely sampled, as some animals/groups in these areas will be missed. The figure below illustrates this idea for a series of arbitrary detection surfaces.
The first plot in this figure shows a flat surface where the detection probability is 0.5 everywhere. In this scenario every animal/group has a 50% chance of being detected. If the area covered by the surface was 10km2, then the effective sampling area would be 10km2 x 0.5 = 5km2. Using this detection process we would expect to detect the same number of animals/groups as we would if we perfectly sampled an area of 5km2. In the second plot in the figure above half of the area is sampled perfectly and the other half is not sampled at all, so this has the same effective sampling area as the first plot. The third plot has a detection gradient and isn't as intuitive to interpret. However, the way we calculate the effective sampling area is to calculate the volume under the detection surface. The third plot has the same volume as the other two, so it has the same effective area.
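This volume-under-the-surface calculation can be verified numerically. The sketch below reconstructs the three scenarios described above on a 100 x 100 grid covering 10km2 (so each cell represents 0.001 km2) and shows that all three have the same effective sampling area:

```r
# Each surface covers 10 km^2, discretised as a 100 x 100 grid of cells
cell_area <- 10 / (100 * 100)

flat     <- matrix(0.5, 100, 100)                         # p = 0.5 everywhere
split    <- cbind(matrix(1, 100, 50), matrix(0, 100, 50)) # half 1, half 0
gradient <- matrix(rep(seq(1, 0, length.out = 100), each = 100),
                   nrow = 100, ncol = 100)                # linear gradient

# Effective sampling area = volume under the detection surface
esa <- function(surface) sum(surface) * cell_area

esa(flat)      # 5 km^2
esa(split)     # 5 km^2
esa(gradient)  # 5 km^2
```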
3.1 Launch from R
— 3.1.1 Install R
— 3.1.2 Install prerequisite R packages
— 3.1.3 Install the gibbonsecr package
— 3.1.4 Launch the user interface
3.2 Launch from a desktop icon
— 3.2.1 Download the files
— 3.2.2 Make a shortcut icon
There are currently two ways to install the gibbonsecr software: (i) by installing a statistical software package called R from which you can launch the user interface; or (ii) by downloading a pre-compiled version of R and adding a shortcut icon for the user interface to the desktop.
Note to Mac users: Before you begin the installation process you need to make sure you have the XQuartz software (also known as X11) on your machine. Users of OS X 10.5 (Leopard), 10.6 (Snow Leopard) and 10.7 (Lion) should already have this installed by default (to check, look for the X11.app application in your applications folder). Users of OS X 10.8 (Mountain Lion), 10.9 (Mavericks) and 10.10 (Yosemite) will need to install it manually.
Make sure you have the latest version of R installed.
Optionally you can also install something called RStudio which acts as an interface to R and is more user-friendly (it has syntax highlighting and auto-completion for example).
The gibbonsecr package uses some other R packages which don’t come with the default version of R, so you’ll need to install them manually by typing (or cutting and pasting) the code below into the R console.
install.packages(c("CircStats", "fields", "MASS", "nlme", "secr", "tcltk2"))

Once the prerequisite packages are installed you can install the gibbonsecr package. It’s currently hosted on GitHub but you can download and install it by running the code below.
Windows users:
install.packages("https://github.com/dkidney/gibbonsecr/raw/master/binaries/gibbonsecr_1.0.zip",
repos = NULL, type = "win.binary")

Mac users:
install.packages("https://github.com/dkidney/gibbonsecr/raw/master/binaries/gibbonsecr_1.0.tgz",
repos = NULL, type = "mac.binary")

You only need to run the above steps once. When everything is installed you can launch the user interface by opening R (or RStudio) and typing the following lines into the console.
library(gibbonsecr)
gibbonsecr_gui()

4.1 CSV files
— 4.1.1 Detections
— 4.1.2 Posts
— 4.1.3 Covariates
The first step in conducting an analysis is to import your survey data. This is done via the Data tab.
**SCREENSHOT**
As a minimum you need to prepare a detections file and a posts file. You can also include an optional covariates file. Advice on how to structure these files is given in the sections below. All raw data files need to be in .csv format. The file paths to your data files can be entered manually into the text entry boxes in the CSV files section, or you can navigate to the file path using the ... button.
The detections file contains a record of each detection, with one row per detection. For example, if group 1 was recorded at listening posts A and B then this would count as 2 detections. This file needs to have the following columns:
The screenshot below shows an example detections file for a one-day (i.e. single-occasion) survey.
The posts file contains information on the location and usage of the listening posts. This file needs to have one row per listening post and should contain the following columns:
For example, a post that was used on the first and third days of a three-day survey, but not on the second day, would have 101 in the usage column for that row. Each row in the usage column should contain the same number of digits.

The screenshot below shows an example posts file for a one-day survey.
The covariates file contains information on environmental and other variables associated with the survey data. This file needs to have one row per day for each listening post and should contain the following columns:
These columns can all be used as covariates themselves, but any additional covariates should be added using additional columns. Use underscores _ instead of full stops for the covariate names.
The screenshot below shows an example covariates file for a one-day survey.
Once the paths to the data files have been entered, select the relevant units from the Data details dropdown boxes for your estimated bearings data (and estimated distances data if it was collected). (Note that the current version of the software only allows Type = continuous since interval methods for bearings and distances haven’t yet been implemented.)
5.1 Mask size and resolution
— 5.1.1 Buffer
— 5.1.2 Spacing
5.2 SHP files
— 5.2.1 Region
— 5.2.2 Habitat
The SECR model fitting procedure requires the use of a mask which is a fine grid of latitude and longitude coordinates around each array of listening posts. When an SECR model is fitted, the mask is used to provide a set of candidate locations for each detected group. It is important to use a suitable mask to avoid unreliable results.
There are two main settings you need to consider when defining a mask – the buffer and the spacing – which you can specify in the Mask tab.
The buffer defines the maximum distance between the mask points and the listening posts. It needs to be large enough that the region it encompasses contains all plausible locations for the detected groups: buffer distances that are too small will lead to underestimates of the effective sampling area and overestimates of density. However, increasing the buffer distance also increases the number of mask points, which means that models will take longer to run, so the buffer shouldn’t be larger than it needs to be. The ideal buffer distance is the distance at which the overall detection probability drops to zero.
A good way to check whether the buffer distance is large enough is to look at the detection surface, which you can plot after fitting a model (see the section on plotting results). The detection surface plot produced by gibbonsecr is the same size as the mask, so the colour at the edge of the plot will show you the overall detection probability at the buffer distance. If the detection probability is greater than zero at the buffer distance then you should increase the buffer distance, re-fit the model and re-check the detection surface plot.
To illustrate this issue, the figure below shows a series of detection surfaces from models that were fitted using mask buffers of 1000m, 10000m and 5000m respectively.
The buffer in plot 1 looks too small, since the detection probability at the buffer distance is much greater than zero. In this case it is extremely likely that the true locations of some of the detected groups were actually outside the buffer zone. In plot 2 the buffer has been increased to 10000m. The detection probability at the buffer distance looks to be at zero so we would expect the density estimate to be unbiased. The density estimate in plot 2 is about 75% lower than the estimate in plot 1, which suggests that the estimate in plot 1 is a big overestimate. The buffer distance in plot 3 is intermediate between the other two. The detection probability is still zero at the buffer distance, and the estimated density is very similar to plot 2, so it doesn’t look to be biased. In this case the mask in plot 3 would be preferred since (for a given resolution) it will be much quicker to fit models than the mask in plot 2 whilst still giving reliable results.
The spacing is the distance between adjacent mask points. Decreasing the spacing will therefore increase the resolution and increase the total number of mask points. Smaller spacings provide a greater number of candidate locations and lead to more reliable results. However, increasing the number of mask points has a cost in terms of computing time and if the spacing is too small then models may take a very long time to run. As a general rule of thumb, try to use the smallest spacing that is practical given the speed of your computer, but try not to use spacings larger than 250m.
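To see how the buffer and spacing settings interact, the base-R sketch below builds a simple rectangular grid of candidate locations around a hypothetical post array. (In practice gibbonsecr constructs the mask for you from the Buffer and Spacing settings in the Mask tab; this is only an illustration of why halving the spacing roughly quadruples the number of mask points, and hence the computing time.)

```r
# Build a rectangular grid of mask points around the posts,
# extending 'buffer' metres beyond the array in each direction,
# with grid points 'spacing' metres apart.
make_simple_mask <- function(posts, buffer = 5000, spacing = 250) {
  xs <- seq(min(posts[, 1]) - buffer, max(posts[, 1]) + buffer, by = spacing)
  ys <- seq(min(posts[, 2]) - buffer, max(posts[, 2]) + buffer, by = spacing)
  expand.grid(x = xs, y = ys)
}

# Hypothetical array of three listening posts (coordinates in metres)
posts <- cbind(x = c(0, 500, 1000), y = c(0, 0, 0))

nrow(make_simple_mask(posts, buffer = 5000, spacing = 250))
# halving the spacing roughly quadruples the number of mask points:
nrow(make_simple_mask(posts, buffer = 5000, spacing = 125))
```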
The Mask tab also allows you to upload shapefiles in order to attach spatial covariate values to each of the mask points for use in model formulas.
6.2 Model parameters
— 6.2.1 Formulas
— 6.2.2 Fixing parameter values
Once you have made a mask you can move on to the Model tab and start fitting some SECR models.
**SCREENSHOT**
Specifying a model is split into two steps: (i) choosing what kind of detection function and bearing error distribution you want to use, and (ii) deciding whether to fix any parameter values or model them using the available covariates. These steps are described in more detail below.
The first section in the Model tab contains dropdown boxes where you can choose between different detection functions and different distributions for the estimated bearings and distances.
Setting the bearings/distances distribution to none means that the bearings/distances data will be ignored in the analysis. Setting both bearings and distances distributions to none will result in a conventional SECR model being fitted using only the recapture data.
The next section in the Model tab provides various options for refining your model. Each row in this section relates to a particular parameter in the SECR model.
D – The number of groups per square kilometer
g0 – The detection function intercept parameter (see Section 2.4)
sigma – The detection function scale parameter (see Section 2.4)
bearings – The parameter of the distribution for the bearing errors (see Section 2.2)
distances – The parameter of the distribution for the estimated distances (see Section 2.3)
pcall – The probability of a group calling on a given day

Don’t worry if you forget these definitions: hovering your cursor over the row labels on the user interface will open a temporary help box to give you a reminder.
If you wish to specify a formula for a particular component you need to click the radio button on the right hand side of the Formula entry box for that component to activate the entry box. If the radio button is clicked but the box is left empty then the default formula using no covariates (i.e. an intercept-only model) will be assumed. If you wish to specify a model using covariates then you need to type in the names of the covariates you wish to use, separated by + signs. For example, to model sigma in terms of habitat and weather you would type,
habitat + weather
in the formula box for sigma.
A note to experienced R users: For those of you who are familiar with R formula syntax, more complicated formula syntax (such as the use of as.factor and as.numeric to coerce variables, and -1 to change the model contrasts) is NOT supported. However, the use of the gam smooth functions s, te, ti and t2 (from the mgcv package) for numeric variables IS supported.
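To illustrate the range of supported syntax, here are some hypothetical entries for the Formula box (the covariate names habitat, weather and altitude are purely illustrative – use the names of the covariates in your own data):

```
habitat + weather       additive effects of two covariates
s(altitude)             smooth effect of a numeric covariate (mgcv-style)
s(altitude) + habitat   a smooth term plus a second covariate
```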
Sometimes you may not want or need to estimate a particular parameter, in which case you can fix its value. To do this, click on the radio button on the right hand side of the Fixed entry box and type the value of the parameter in the box.
g0
Note that g0 should be fixed at 1 for one-day surveys. This is because the probability of detecting a calling group at zero distance from a listening post is extraordinarily unlikely to be anything other than 1 (since you will be standing under the tree it’s calling from). However, this is NOT the case for multi-day surveys. In a multi-day survey the movement of groups between consecutive sampling days means that we have to redefine group ‘location’ as being the average location of the group. As a result, g0 needs to be interpreted as the probability of detecting a calling group at zero distance from the average location – in other words, the probability of detecting a calling group when the listening post is at the average location. Since a group is unlikely to always be at its average location during a multi-day survey (unless it happens not to move) this probability is unlikely to be 1 and the g0 parameter should be estimated.
pcall
Note also that pcall is fixed at 1 by default. For one-day surveys, pcall can’t be estimated so the only option is to provide a fixed value. Fixing pcall to 1 for one-day surveys means that the D parameter should be interpreted as the density of calling groups, rather than the density of groups. However, if prior knowledge of the calling probability is available you can type this into the Fixed box, in which case the density parameter can be reinterpreted as the density of groups. For one-day surveys changing the pcall value will result in a direct scaling of the density estimate. For example, if you had an estimated calling group density of 5, changing the fixed value for pcall to 0.5 and re-fitting the model would result in a group density estimate of 10. For multi-day surveys the D parameter is always interpreted as the density of groups. In this case you can either fix the value of pcall or estimate it from the data. Fixing pcall to 1 for a multi-day survey is likely to result in an underestimate of group density.
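The scaling described above is a straightforward division: for a one-day survey, D estimated with pcall fixed at 1 is the density of calling groups, and dividing by the fixed pcall value converts it to a group density. A minimal sketch of the worked example in the paragraph above:

```r
# Estimated density with pcall fixed at 1 (density of *calling* groups)
calling_group_density <- 5

# For a one-day survey, the fixed pcall value scales the estimate directly
group_density <- function(calling_density, pcall) calling_density / pcall

group_density(calling_group_density, pcall = 1)    # 5: calling groups only
group_density(calling_group_density, pcall = 0.5)  # 10: all groups
```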
Bear in mind that fixing a parameter value will often lead to a more precise density estimate. If the fixed parameter is known with certainty then this is a desirable effect. However, if there is uncertainty over the true value of that parameter (e.g. you may have used an estimate from a previous study) then this uncertainty will not be incorporated into the uncertainty of the density estimate, and the precision of the density estimate will therefore be overestimated.